Voice Conversion Using GMM with Enhanced Global Variance
Abstract
The goal of voice conversion is to transform a sentence spoken by one speaker so that it sounds as if another speaker had said it. The classical conversion based on a Gaussian Mixture Model (GMM), and several other schemes suggested since, produce muffled-sounding outputs due to excessive smoothing of the spectral envelopes. To reduce the muffling effect, enhancement of the Global Variance (GV) of the spectral features was recently suggested. We propose a different approach to GV enhancement, based on the classical conversion formalized as a GV-constrained minimization. Listening tests show that the proposed approach improves quality.
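To make the notion of global variance concrete, the following is a minimal sketch in Python with NumPy. It assumes that GV means the per-dimension variance of a spectral feature trajectory over one utterance, and it applies plain variance scaling as a post-filter. This is a commonly used simplification, not the GV-constrained minimization proposed in the paper; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def global_variance(features):
    """Per-dimension variance over time of a (frames, dims) feature matrix."""
    return np.var(features, axis=0)

def match_global_variance(converted, target_gv, eps=1e-12):
    """Rescale each dimension around its utterance mean so that its
    variance matches target_gv (simple variance scaling, an assumption,
    not the paper's constrained minimization)."""
    mean = converted.mean(axis=0)
    scale = np.sqrt(target_gv / (global_variance(converted) + eps))
    return mean + scale * (converted - mean)

# Synthetic stand-ins: real systems would use spectral features such as
# mel-cepstra extracted from speech.
rng = np.random.default_rng(0)
target = rng.normal(size=(200, 4))   # hypothetical target-speaker features
converted = 0.5 * target + 0.1       # over-smoothed output: variance shrunk
enhanced = match_global_variance(converted, global_variance(target))
```

In this toy case the over-smoothed trajectory has a quarter of the target variance in every dimension, and the post-filter restores the target GV while leaving the utterance mean unchanged.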
Similar resources
An improved one-to-many eigenvoice conversion system
We have previously developed a one-to-many eigenvoice conversion (EVC) system enabling the conversion from a specific source speaker’s voice into an arbitrary target speaker’s voice. In this system, an eigenvoice Gaussian mixture model (EV-GMM) is trained in advance with multiple parallel data sets composed of utterance pairs of the source and many pre-stored target speakers. The EV-GMM is effecti...
Sequential Voice Conversion Using Grid-Based Approximation (Irwin and Joan Jacobs Center for Communication and Information Technologies)
The goal of voice conversion is to modify a source speaker’s speech to sound as if spoken by a target speaker. Common conversion methods are based on Gaussian Mixture Modeling (GMM), which require exhaustive training (typically lasting hours), often leading to ill-conditioning, if the dataset used is too small. Additionally, the training process is based on a one-to-one match between the source...
Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems
This article examines methods for improving the performance of GMM-based voice conversion systems, taking the GMM method as the baseline. In current methods, because a single conversion function is used to convert all speech units and statistical averaging then smooths the spectrum, we observe quality ...
Grid-based approximation for voice conversion in low resource environments
The goal of voice conversion is to modify a source speaker’s speech to sound as if spoken by a target speaker. Common conversion methods are based on Gaussian mixture modeling (GMM). They aim to statistically model the spectral structure of the source and target signals and require relatively large training sets (typically dozens of sentences) to avoid over-fitting. Moreover, they often lead to...
Techniques for Improving Voice Conversion Based on Eigenvoices (Doctoral Thesis)
Voice conversion (VC) is a technique for converting a source speaker’s voice into another speaker’s voice without changing linguistic information. As a typical approach to VC, a statistical method based on a Gaussian mixture model (GMM) is widely used. A GMM is trained as a conversion model using a parallel data set composed of many utterance pairs of source and target speakers. Although this fra...